Explaining nestedcv models with Shapley values

Myles Lewis

nestedcv provides two methods for understanding fitted models. The simpler of the two is to plot variable importance; the newer method is to calculate Shapley values for each predictor.

Variable importance and variable stability

For regression model systems such as glmnet, variable importance is represented by the model coefficients, ranked by absolute value from largest to smallest. However, the outer folds of nested CV allow us to show the variance of the model coefficients across each outer fold as well as the final model, and hence to see how stable the model is. We can also overlay how often each predictor is selected across these models, giving a sense of the stability of predictor selection.

In the example below, using the Boston housing dataset, a glmnet regression model is fitted and variable importance is shown based on the coefficients in the final model and in the tuned models from the 10 outer folds. min_1se is set to 1, which is equivalent to specifying s = "lambda.1se" in glmnet, to encourage a sparser model.

library(nestedcv)
library(mlbench)  # Boston housing dataset
data(BostonHousing2)
dat <- BostonHousing2
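
The code above is truncated in the source; what follows is a minimal sketch of how the example plausibly continues, assuming cmedv (the corrected median house value) as the outcome and dropping the uncorrected medv along with the non-numeric columns town and chas:

y <- dat$cmedv  # outcome: corrected median house value
x <- subset(dat, select = -c(cmedv, medv, town, chas))  # numeric predictors only

# Fit a nested CV glmnet model; min_1se = 1 is equivalent to s = "lambda.1se"
fit <- nestcv.glmnet(y, x, family = "gaussian",
                     min_1se = 1, n_outer_folds = 10)

# Variable importance with stability across the 10 outer folds + final model
plot_var_stability(fit)

plot_var_stability() shows the spread of coefficients across the outer-fold models alongside the final model, with predictor selection frequency overlaid.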
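
For the second method mentioned above, Shapley values can be computed with the fastshap package; a sketch, assuming the fit object from the previous block and the prediction wrapper pred_nestcv_glmnet exported by nestedcv (nsim = 5 is a deliberately small number of Monte Carlo simulations, chosen here for speed):

library(fastshap)

# Approximate Shapley values for each predictor and observation
sh <- explain(fit, X = x, pred_wrapper = pred_nestcv_glmnet, nsim = 5)

# nestedcv helpers for visualising Shapley values
plot_shap_bar(sh, x)       # mean absolute Shapley value per predictor
plot_shap_beeswarm(sh, x)  # per-observation Shapley values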

